Impact of Environmental Factors on LoRa 2.4 GHz Time of Flight Ranging Outdoors
Zhou, Yiqing, Zhou, Xule, Cheng, Zecan, Lu, Chenao, Chen, Junhan, Pan, Jiahong, Liu, Yizhuo, Li, Sihao, Kim, Kyeong Soo
In WSN/IoT, node localization is essential to long-running applications for accurate environment monitoring and event detection, often covering a large area in the field. Due to the lower time resolution of typical WSN/IoT platforms (e.g., 1 microsecond on ESP32 platforms) and jitter in timestamping, packet-level localization techniques cannot provide meter-level resolution. For high-precision localization as well as worldwide interoperability via the 2.4 GHz ISM band, a new variant of LoRa, called LoRa 2.4 GHz, was proposed by Semtech, which provides a radio frequency (RF) time of flight (ToF) ranging method for meter-level localization. However, the existing datasets reported in the literature are limited in their coverage and do not take into account varying environmental factors such as temperature and humidity. To address these issues, LoRa 2.4 GHz RF ToF ranging data was collected on a sports field at the XJTLU south campus, where three LoRa nodes logged ranging samples with a LoRa base station, together with temperature and humidity, at reference points arranged as a 3x3 grid covering 400 square meters over three weeks, and uploaded all measurement records to the base station equipped with an ESP32-based transceiver for machine and user communications. The results of a preliminary investigation based on a simple deep neural network (DNN) model demonstrate that environmental factors, including temperature and humidity, significantly affect ranging accuracy, which calls for advanced methods of compensating for the effects of environmental factors on LoRa RF ToF ranging outdoors.
- Europe > Austria > Vienna (0.14)
- Asia > China > Shaanxi Province > Xi'an (0.05)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (3 more...)
- Telecommunications (0.56)
- Leisure & Entertainment (0.54)
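The compensation idea in the abstract above can be sketched minimally. The snippet below fits a linear least-squares model of ranging error against temperature and humidity on invented synthetic data; it is a hedged stand-in for the paper's DNN and its real measurements, not the authors' method.

```python
# Hypothetical sketch: correcting ToF ranging bias with environmental
# features (temperature, humidity). All data below is synthetic.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic measurements: true distance 20 m, with an invented bias that
# drifts with temperature (deg C) and relative humidity (%).
n = 200
temp = rng.uniform(10, 35, n)
hum = rng.uniform(30, 90, n)
true_dist = 20.0
measured = (true_dist + 0.05 * (temp - 20) + 0.02 * (hum - 60)
            + rng.normal(0, 0.1, n))

# Fit error = w0 + w1*temp + w2*hum by ordinary least squares.
X = np.column_stack([np.ones(n), temp, hum])
w, *_ = np.linalg.lstsq(X, measured - true_dist, rcond=None)

corrected = measured - X @ w
print(f"raw RMSE:       {np.sqrt(np.mean((measured - true_dist)**2)):.3f} m")
print(f"corrected RMSE: {np.sqrt(np.mean((corrected - true_dist)**2)):.3f} m")
```

A DNN regressor, as in the paper, would replace the least-squares fit when the error depends nonlinearly on the environmental features.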
KVCache Cache in the Wild: Characterizing and Optimizing KVCache Cache at a Large Cloud Provider
Wang, Jiahao, Han, Jinbo, Wei, Xingda, Shen, Sijie, Zhang, Dingyan, Fang, Chenguang, Chen, Rong, Yu, Wenyuan, Chen, Haibo
Serving large language models (LLMs) is important for cloud providers, and caching intermediate results (KV$) after processing each request substantially improves serving throughput and latency. However, there is limited understanding of how LLM serving benefits from KV$ caching, where system design decisions like cache eviction policies are highly workload-dependent. In this paper, we present the first systematic characterization of the KV$ workload patterns from one of the leading LLM service providers. We draw observations that were not covered by previous studies focusing on synthetic workloads, including: KV$ reuses are skewed across requests, where reuses between single-turn requests are as important as those between multi-turn requests; the reuse time and probability are diverse across all requests, but for a specific request category the pattern tends to be predictable; and the overall cache size required for an ideal cache hit ratio is moderate. Based on this characterization, we further propose a workload-aware cache eviction policy that improves serving performance under real-world traces, especially with limited cache capacity.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Austria > Vienna (0.14)
- North America > United States > California > Santa Clara County > Santa Clara (0.05)
- (19 more...)
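One way a workload-aware eviction policy can differ from plain LRU is sketched below. This is an illustrative toy, not the paper's actual policy: it evicts entries whose request category has the lowest predicted reuse probability first, breaking ties by recency, and the category probabilities are invented placeholders.

```python
# Toy workload-aware cache: evict by (category reuse probability, LRU order).
from collections import OrderedDict

class WorkloadAwareCache:
    def __init__(self, capacity, reuse_prob):
        self.capacity = capacity
        self.reuse_prob = reuse_prob   # category -> estimated reuse probability
        self.entries = OrderedDict()   # key -> category, oldest first

    def access(self, key, category):
        """Touch `key`; return True on a hit, False on a miss."""
        hit = key in self.entries
        if hit:
            self.entries.move_to_end(key)  # refresh recency
        else:
            if len(self.entries) >= self.capacity:
                order = list(self.entries)
                # Victim: lowest reuse probability, then least recently used.
                victim = min(order, key=lambda k: (
                    self.reuse_prob.get(self.entries[k], 0.0), order.index(k)))
                del self.entries[victim]
            self.entries[key] = category
        return hit
```

With capacity 2 and probabilities `{"multi": 0.9, "single": 0.2}`, inserting a third entry evicts the cached `"single"` entry even if a `"multi"` entry is older, which is the behavior plain LRU cannot express.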
Learning To Communicate Over An Unknown Shared Network
Agarwal, Shivangi, Asija, Adi, Kaul, Sanjit K., Bhattacharya, Arani, Anand, Saket
As robots (edge-devices, agents) find uses in an increasing number of settings and edge-cloud resources become pervasive, wireless networks will often be shared by flows of data traffic that result from communication between agents and the corresponding edge-cloud. In such settings, an agent communicating with the edge-cloud is unaware of the state of the network resource, which evolves in response not just to the agent's own communication at any given time but also to communication by other agents, which stays unknown to the agent. We address the challenge of an agent learning a policy that allows it to decide whether or not to communicate with its cloud node, using the limited feedback it obtains from its own attempts to communicate, to optimize its utility. The policy must generalize well to any number of other agents sharing the network and must not be trained for any particular network configuration. Our proposed policy is a DRL model, Query Net (QNet), that we train using a proposed simulation-to-real framework. Our simulation model has just one parameter and is agnostic to the specific configuration of any wireless network. By suitably randomizing the simulation parameter, it allows training an agent's policy over a wide range of outcomes that the agent's communication with its edge-cloud node may face on a shared network. We propose a learning algorithm that addresses challenges observed in training QNet. We validate our simulation-to-real approach through experiments conducted on real wireless networks, including WiFi and cellular. We compare QNet with other policies to demonstrate its efficacy. The WiFi experiments involved as few as five agents, resulting in barely any contention for the network, and as many as fifty agents, resulting in severe contention. The cellular experiments spanned a broad range of network conditions, with baseline RTT ranging from a low of 0.07 seconds to a high of 0.83 seconds.
- Europe > Austria > Vienna (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- (15 more...)
- Telecommunications > Networks (0.48)
- Information Technology > Networks (0.34)
- Information Technology > Communications > Networks (1.00)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Characterizing Trust and Resilience in Distributed Consensus for Cyberphysical Systems
Yemini, Michal, Nedić, Angelia, Goldsmith, Andrea, Gil, Stephanie
This work considers the problem of resilient consensus where stochastic values of trust between agents are available. Specifically, we derive a unified mathematical framework to characterize convergence, deviation of the consensus from the true consensus value, and expected convergence rate, when additional information about trust between agents exists. We show that under certain conditions on the stochastic trust values and consensus protocol: 1) almost sure convergence to a common limit value is possible even when malicious agents constitute more than half of the network connectivity, 2) the deviation of the converged limit from the case where there is no attack, i.e., the true consensus value, can be bounded with probability that approaches 1 exponentially, and 3) correct classification of malicious and legitimate agents can be attained in finite time almost surely. Further, the expected convergence rate decays exponentially as a function of the quality of the trust observations between agents.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- (3 more...)
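The core mechanism in the abstract above, suppressing contributions from low-trust agents during consensus averaging, can be illustrated with a toy simulation. This is a hedged sketch, not the paper's protocol: trust observations are drawn from invented distributions and thresholded into hard weights, and the attacker model is simply "keep injecting a large value".

```python
# Toy trust-weighted consensus: 4 legitimate agents, 3 attackers pushing 100.
import numpy as np

rng = np.random.default_rng(1)
n_legit, n_mal = 4, 3
x = np.array([1.0, 2.0, 3.0, 4.0, 100.0, 100.0, 100.0])

# Noisy trust observations: high mean for legitimate agents, low for attackers.
trust = np.concatenate([rng.normal(0.9, 0.05, n_legit),
                        rng.normal(0.1, 0.05, n_mal)])
w = (trust > 0.5).astype(float)  # classify, then weight only trusted agents
w /= w.sum()

for _ in range(50):
    avg = w @ x
    x[:n_legit] = avg       # legitimate agents move to the trust-weighted mean
    x[n_legit:] = 100.0     # attackers keep injecting their value

print(x[:n_legit])  # converges near the honest average despite more attackers than half
```

Note that the attackers outnumber half of the honest agents' links, yet the trust weights let the honest agents converge to the uncorrupted average, mirroring result 1) of the abstract in a toy setting.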
Machine Learning Fleet Efficiency: Analyzing and Optimizing Large-Scale Google TPU Systems with ML Productivity Goodput
Wongpanich, Arissa, Oguntebi, Tayo, Paredes, Jose Baiocchi, Wang, Yu Emma, Phothilimthana, Phitchaya Mangpo, Mitra, Ritwika, Zhou, Zongwei, Kumar, Naveen, Reddi, Vijay Janapa
Recent years have seen the emergence of machine learning (ML) workloads deployed in warehouse-scale computing (WSC) settings, also known as ML fleets. As the computational demands placed on ML fleets have increased due to the rise of large models and growing demand for ML applications, it has become increasingly critical to measure and improve the efficiency of such systems. However, there is not yet an established methodology to characterize ML fleet performance and identify potential performance optimizations accordingly. This paper presents a large-scale analysis of an ML fleet based on Google's TPUs, introducing a framework to capture fleet-wide efficiency, systematically evaluate performance characteristics, and identify optimization strategies for the fleet. We begin by defining an ML fleet, outlining its components, and analyzing an example Google ML fleet in production comprising thousands of accelerators running diverse workloads. Our study reveals several critical insights: first, ML fleets extend beyond the hardware layer, with model, data, framework, compiler, and scheduling layers significantly impacting performance; second, the heterogeneous nature of ML fleets poses challenges in characterizing individual workload performance; and third, traditional utilization-based metrics prove insufficient for ML fleet characterization. To address these challenges, we present the "ML Productivity Goodput" (MPG) metric to measure ML fleet efficiency. We show how to leverage this metric to characterize the fleet across the ML system stack. We also present methods to identify and optimize performance bottlenecks using MPG, providing strategies for managing warehouse-scale ML systems in general. Lastly, we demonstrate quantitative evaluations from applying these methods to a real ML fleet for internal-facing Google TPU workloads, where we observed tangible improvements.
- North America > United States > New York > New York County > New York City (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
- (7 more...)
Specification Generation for Neural Networks in Systems
Chaudhary, Isha, Lin, Shuyi, Tan, Cheng, Singh, Gagandeep
Specifications - precise mathematical representations of correct domain-specific behaviors - are crucial to guarantee the trustworthiness of computer systems. With the increasing deployment of neural networks as computer system components, specifications gain more importance as they can be used to regulate the behaviors of these black-box models. Traditionally, specifications are designed by domain experts based on their intuition of correct behavior. However, this is labor-intensive and hence not a scalable approach as computer system applications diversify. We hypothesize that the traditional (aka reference) algorithms that neural networks replace for higher performance can act as effective proxies for the correct behaviors of the models, when available. This is because they have been used and tested for long enough to encode several aspects of trustworthy/correct behavior in the underlying domain. Driven by this hypothesis, we develop a novel automated framework, SpecTRA, to generate specifications for neural networks using references. We formulate specification generation as an optimization problem and solve it with observations of reference behaviors. SpecTRA clusters similar observations into compact specifications. We present specifications generated by SpecTRA for neural networks in adaptive bit rate and congestion control algorithms. Our specifications show evidence of being correct and matching intuition. Moreover, we use our specifications to reveal several previously unknown vulnerabilities of state-of-the-art models for computer systems.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- (3 more...)
FastRAG: Retrieval Augmented Generation for Semi-structured Data
Abane, Amar, Bekri, Anis, Battou, Abdella
Efficiently processing and interpreting network data is critical for the operation of increasingly complex networks. Recent advances in Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) techniques have improved data processing in network management. However, existing RAG methods like VectorRAG and GraphRAG struggle with the complexity and implicit nature of semi-structured technical data, leading to inefficiencies in time, cost, and retrieval. This paper introduces FastRAG, a novel RAG approach designed for semi-structured data. FastRAG employs schema learning and script learning to extract and structure data without needing to submit entire data sources to an LLM. It integrates text search with knowledge graph (KG) querying to improve accuracy in retrieving context-rich information. Evaluation results demonstrate that FastRAG provides accurate question answering while reducing time by up to 90% and cost by up to 85% compared to GraphRAG.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Illinois > DuPage County > Lombard (0.04)
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
- Telecommunications > Networks (0.35)
- Information Technology > Networks (0.35)
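The hybrid of text search and KG querying mentioned in the abstract above can be sketched in a few lines. This is an illustrative toy, not FastRAG's implementation: the keyword scorer, the `(entity, relation) -> value` KG encoding, and the sample data are all invented.

```python
# Toy hybrid retrieval: merge KG facts about entities mentioned in the query
# with keyword-matched passages.
def text_search(query, passages):
    """Rank passages by count of shared lowercase terms with the query."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(p.lower().split())), p) for p in passages]
    return [p for score, p in sorted(scored, reverse=True) if score > 0]

def kg_query(query, kg):
    """Return facts for any KG entity whose name appears in the query.
    kg: dict mapping (entity, relation) -> value."""
    return [f"{e} --{r}--> {v}" for (e, r), v in kg.items()
            if e.lower() in query.lower()]

def hybrid_retrieve(query, passages, kg, k=3):
    """Structured facts first, then keyword hits, truncated to k results."""
    return (kg_query(query, kg) + text_search(query, passages))[:k]
```

For a query like "why does R1 drop packets", the retriever can return both the structured relation for `R1` and the log passage mentioning the drops, which is the kind of context-rich mix the abstract describes.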
Towards Safer Heuristics With XPlain
Karimi, Pantea, Pirelli, Solal, Kakarla, Siva Kesava Reddy, Beckett, Ryan, Segarra, Santiago, Li, Beibin, Namyar, Pooria, Arzani, Behnaz
Many problems that cloud operators solve are computationally expensive, and operators often use heuristic algorithms (which are faster and scale better than optimal ones) to solve them more efficiently. Heuristic analyzers enable operators to find when and by how much their heuristics underperform. However, these tools do not provide enough detail for operators to mitigate the heuristic's impact in practice: they only discover a single input instance that causes the heuristic to underperform (not the full set), and they do not explain why. We propose XPlain, a tool that extends these analyzers and helps operators understand when and why their heuristics underperform. We present promising initial results that show such an extension is viable.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > California > Orange County > Irvine (0.05)
- North America > United States > New York > New York County > New York City (0.05)
- (12 more...)
Erasure Coded Neural Network Inference via Fisher Averaging
Jhunjhunwala, Divyansh, Jali, Neharika, Joshi, Gauri, Wang, Shiqiang
Erasure-coded computing has been successfully used in cloud systems to reduce tail latency caused by factors such as straggling servers and heterogeneous traffic variations. A majority of cloud computing traffic now consists of inference on neural networks on shared resources, where the response time of inference queries is also adversely affected by the same factors. However, current erasure coding techniques are largely focused on linear computations such as matrix-vector and matrix-matrix multiplications and hence do not work for the highly non-linear neural network functions. In this paper, we seek to design a method to code over neural networks: that is, given two or more neural network models, construct a coded model whose output is a linear combination of the outputs of the given neural networks. We formulate the problem as a KL barycenter problem and propose a practical algorithm, COIN, that leverages the diagonal Fisher information to create a coded model that approximately outputs the desired linear combination of outputs. We conduct experiments to perform erasure coding over neural networks trained on real-world vision datasets and show that the accuracy of the decoded outputs using COIN is significantly higher than other baselines while being extremely compute-efficient.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- Africa > Rwanda > Kigali > Kigali (0.04)
- North America > United States > New York (0.04)
- (4 more...)
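The diagonal-Fisher weighting at the heart of the abstract above can be shown on toy parameters. This is a hedged sketch of Fisher-weighted parameter averaging, not the COIN algorithm itself: the two "models" are bare weight vectors and the Fisher estimates are invented.

```python
# Toy Fisher-weighted merge: parameters a model is confident about (high
# diagonal Fisher information) dominate the merged model.
import numpy as np

def fisher_average(params_a, params_b, fisher_a, fisher_b, alpha=0.5):
    """Per-parameter weighted average of two models, weighted by their
    diagonal Fisher information and mixing coefficient alpha."""
    num = alpha * fisher_a * params_a + (1 - alpha) * fisher_b * params_b
    den = alpha * fisher_a + (1 - alpha) * fisher_b
    return num / den

# Two toy linear models y = w . x, each confident about one coordinate.
w_a = np.array([1.0, 0.0])
w_b = np.array([0.0, 2.0])
f_a = np.array([10.0, 0.1])   # model A: high Fisher on coordinate 0
f_b = np.array([0.1, 10.0])   # model B: high Fisher on coordinate 1

w_coded = fisher_average(w_a, w_b, f_a, f_b)
print(w_coded)  # each coordinate stays close to the confident model's value
```

A plain average would give `[0.5, 1.0]`, halving both models' important weights; the Fisher weighting instead keeps each coordinate near the model that actually learned it, which is why it is a better starting point for constructing a coded model.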
LightningNet: Distributed Graph-based Cellular Network Performance Forecasting for the Edge
Zacharopoulos, Konstantinos, Koutroumpas, Georgios, Arapakis, Ioannis, Georgopoulos, Konstantinos, Khangosstar, Javad, Ioannidis, Sotiris
The cellular network plays a pivotal role in providing Internet access, since it is the only global-scale infrastructure with ubiquitous mobility support. To manage and maintain large-scale networks, mobile network operators require timely information, or even accurate performance forecasts. In this paper, we propose LightningNet, a lightweight and distributed graph-based framework for forecasting cellular network performance that captures the spatio-temporal dependencies arising in network traffic. LightningNet achieves a steady performance increase over state-of-the-art forecasting techniques while maintaining a similar resource usage profile. Our architecture is also specifically designed to support IoT and edge devices, a further step beyond the current state-of-the-art, as indicated by our performance experiments with NVIDIA Jetson.
- Europe > United Kingdom (0.14)
- Asia > China (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- (12 more...)
- Telecommunications (1.00)
- Information Technology > Networks (1.00)
- Information Technology > Communications > Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.67)